```python
import numpy as np
import matplotlib.pyplot as plt
```
When you compute the ratio of two random numbers, the result is biased in general, unless the relative error on the denominator is very small. This is a consequence of the fact that any non-linear transformation of a random variable introduces a bias. The good news is that we can correct for this bias to first order with a simple formula and thus obtain an approximately unbiased result.
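For independent $a$ and $b$ with small relative fluctuations, expanding $1/b$ around $\langle b \rangle$ and taking the expectation gives the first-order formula (this is the analytical curve plotted below):

$$\left\langle \frac{a}{b} \right\rangle \approx \frac{\langle a \rangle}{\langle b \rangle} \left( 1 + \frac{\sigma_b^2}{\langle b \rangle^2} \right),$$

so the multiplicative bias is $1 + \sigma_b^2 / \langle b \rangle^2$ to leading order. We first demonstrate the effect with toy experiments.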
```python
rng = np.random.default_rng(1)

a_true = 100.
b_factor = 20.

# toy experiments: a is a Poisson count, b is a scaled Poisson variable
# with the same mean, so the true ratio is 1
a = rng.poisson(a_true, size=10000)
b = rng.poisson(a_true / b_factor, size=len(a)) * b_factor
b[b == 0] = np.nan  # mask zeros to avoid division by zero

plt.hist(a / b, bins=50, label="toys")
plt.axvline(1, ls="--", color="k", label="true")
plt.axvline(np.nanmean(a / b), ls="-", color="r", label="avg")
plt.legend()
```
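As a quick numerical check (a sketch reusing the variables above), we can compare the toy average with the first-order prediction. For the scaled Poisson denominator, $\sigma_b^2/\langle b \rangle^2 = b_\mathrm{factor}/a_\mathrm{true}$, which is 0.2 here; that relative error is already large, so the two values are expected to differ noticeably.

```python
# sketch: toy average vs. first-order prediction
rel_var_b = b_factor / a_true  # sigma_b^2 / <b>^2 of the scaled Poisson denominator
print("toy average:", np.nanmean(a / b))
print("first order:", 1 + rel_var_b)
```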
```python
@np.vectorize
def ratio_bias(a, a_var, b, b_var):
    if np.isnan(a) or np.isnan(b) or a == 0 or b == 0:
        return np.nan
    # model a and b as scaled Poisson variables that match the
    # given means and variances
    scale_a = a_var / a
    n_eff_a = a / scale_a
    scale_b = b_var / b
    n_eff_b = b / scale_b
    rng = np.random.default_rng(1)
    a_toy = rng.poisson(n_eff_a, size=10000) * scale_a
    b_toy = rng.poisson(n_eff_b, size=len(a_toy)) * scale_b
    b_toy[b_toy == 0] = np.nan  # mask zeros as before
    r_toy = np.nanmean(a_toy / b_toy)
    return r_toy / (a / b)  # multiplicative bias of the ratio
```
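A minimal usage sketch with made-up numbers: since `ratio_bias` returns the multiplicative bias of the ratio, dividing the measured ratio by it yields an approximately unbiased estimate.

```python
# usage sketch (hypothetical measurements)
a_meas, a_var = 50., 50.   # e.g. a Poisson count, where the variance equals the count
b_meas, b_var = 40., 160.  # a measurement with a larger relative error
r_raw = a_meas / b_meas
r_corrected = r_raw / ratio_bias(a_meas, a_var, b_meas, b_var)
print(r_raw, r_corrected)
```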
```python
a = 10.
b = 10.

for a_rel_err in (0.01, 0.5):
    b_rel_err = np.geomspace(0.01, 0.5, 10)
    bias = ratio_bias(a, (a * a_rel_err) ** 2, b, (b * b_rel_err) ** 2)
    plt.plot(b_rel_err, bias, label=f"$\\sigma(a)/a = {a_rel_err:.2f}$")

# analytical formula for the bias
bias = 1 + b_rel_err ** 2
plt.plot(b_rel_err, bias, label="analytical")

plt.ylabel(r"$\langle a/b \rangle$ / truth")
plt.xlabel(r"$\sigma(b)/b$")
plt.legend()
```
The analytical formula is only a first-order correction, so it fails to describe the bias when the relative uncertainties are large.
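To put a number on this, one can evaluate a single point at the upper end of the scan (a sketch reusing `a`, `b`, and `ratio_bias` from above; the toy-based value and the first-order formula should visibly disagree here).

```python
# sketch: toy-based bias vs. first-order formula at a large relative error
x = 0.5  # sigma(b)/b
toy = ratio_bias(a, (a * 0.01) ** 2, b, (b * x) ** 2)
analytic = 1 + x ** 2
print(toy, analytic)
```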